BinPro: A Tool for Binary Source Code Provenance

Authors

  • Dhaval Miyani
  • Zhen Huang
  • David Lie
Abstract

Enforcing open source licenses such as the GNU General Public License (GPL), analyzing a binary for possible vulnerabilities, and code maintenance are all situations where it is useful to be able to determine the source code provenance of a binary. While previous work has focused either on binary-to-binary similarity or on source-to-source similarity, BinPro is the first work we are aware of to tackle the problem of source-to-binary similarity. BinPro can match binaries with their source code even without knowing which compiler produced the binary or which optimization level was used. To do this, BinPro uses machine learning to compute optimal code features for determining binary-to-source similarity, and a static analysis pipeline to extract those features and compute similarity from them. Our experiments show that, on average, BinPro computes a similarity of 81% for matching binaries and source code of the same applications, and an average similarity of 25% for binaries and source code of similar but different applications. This shows that BinPro’s similarity score is useful for determining whether a binary was derived from a particular source code.
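To make the idea of a feature-based similarity score concrete, here is a minimal sketch in Python. It is not BinPro's implementation: the feature names (`string_literals`, `callees`, `int_constants`), the fixed weights, and the use of a Jaccard index are all assumptions chosen for illustration, whereas the abstract says BinPro derives its features with machine learning. The sketch assumes the features have already been extracted from the source code and from the binary.

```python
from typing import Dict, Set

# Hypothetical feature names and weights, chosen only for illustration;
# they are NOT the features or weights that BinPro computes.
WEIGHTS: Dict[str, float] = {
    "string_literals": 0.5,
    "callees": 0.3,
    "int_constants": 0.2,
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(src: Dict[str, Set[str]], binary: Dict[str, Set[str]]) -> float:
    """Weighted average of per-feature Jaccard scores, in the range [0, 1]."""
    total = sum(WEIGHTS.values())
    score = sum(
        weight * jaccard(src.get(name, set()), binary.get(name, set()))
        for name, weight in WEIGHTS.items()
    )
    return score / total


if __name__ == "__main__":
    # Toy, made-up feature sets for one source tree and one binary.
    src = {"string_literals": {"usage: %s", "error"}, "callees": {"malloc"}}
    binary = {"string_literals": {"error"}, "callees": {"malloc", "free"}}
    print(f"similarity = {similarity(src, binary):.2f}")
```

A score near 1 would suggest the binary and the source share most of the chosen features, and a score near 0 would suggest they share few; any decision threshold would have to be calibrated on real data.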

Related resources

BinPro: A Tool for Binary Backdoor Accountability in Code Audits

Dhaval Miyani, Master of Applied Science thesis, Graduate Department of Electrical and Computer Engineering, University of Toronto, 2016. Highly security sensitive organizations often perform source code audits on software they use. However, after the audit is performed, they must still perform a binary code audit to ensure the binary provide...

Facilitating Trust on Data through Provenance

Research on trusted computing focuses mainly on the security and integrity of the execution environment, from hardware components to software services. However, this is only one facet of the computation, the other being the data. If our goal is to produce trusted results, a trustworthy execution environment is not enough: we also need trustworthy data. Provenance of data plays a pivotal role in...

Exploring the Evolution and Provenance of Git Versioned RDF Data

The distributed character and the manifold possibilities for interchanging data on the Web lead to the problem of getting hold of the provenance of the data. Especially in the domain of digital humanities and when dealing with Linked Data in an enterprise context, provenance information is needed to support the collaborative process of data management. We are proposing a possibility for capturin...

Measuring COOP Attack Surface Reduction

Control-flow hijacking attacks currently represent the highest software-based security threat [16]. We want to develop a tool that can measure the exact attack surface reduction with respect to one such attack, Counterfeit Object-Oriented Programming (COOP) [8]. This attack is particularly hard to defend against since traditional Control Flow Integrity (CFI) [1] approaches and hardware based shadow stacks [1...

Looking Inside the Black-Box: Capturing Data Provenance Using Dynamic Instrumentation

Knowing the provenance of a data item helps in ascertaining its trustworthiness. Various approaches have been proposed to track or infer data provenance. However, these approaches either treat an executing program as a black-box, limiting the fidelity of the captured provenance, or require developers to modify the program to make it provenance-aware. In this paper, we introduce DataTracker, a n...

Journal:
  • CoRR

Volume: abs/1711.00830

Publication date: 2017